High-throughput RNA sequencing: a step forward in transcriptome analysis
Abstract
The transcriptome plays an important role in the life of a cell. Detailed analysis of the transcriptome enables interpretation of its structure and functionality. High-throughput sequencing technology has significantly enhanced the understanding of transcriptome activity. The RNA-sequencing process currently provides the most accurate estimation of gene expression levels. Moreover, RNA-seq allows detection of isoform structure and novel RNA types, along with transcription process details such as strand-specificity and much more. The first chapter of this thesis describes the history of transcriptome exploration and effective methods of RNA-seq application. Nevertheless, every step of the RNA-seq process can introduce biases that influence the investigation results. Some typical errors appearing during ligation and amplification procedures might be present in any high-throughput sequencing experiment, while other biases occur only in cDNA synthesis or are specific to transcriptome activity. Quality control of sequencing data is important to verify and correct the analysis results. The second chapter of this thesis is devoted to the explanation of these issues and introduces a novel tool, Qualimap2. This instrument computes detailed statistics and presents a number of plots based on processing of RNA-seq alignment and counts data. The generated results enable detection of problems that are specific to RNA-seq experiments. Notably, the tool supports analysis of multiple samples in various conditions. Qualimap2 was faithfully compared to other available tools and demonstrated superior functionality in multi-sample quality control. Importantly, RNA-seq can be applied in a relatively novel research area: detection of chimeric transcripts and fusion genes occurring due to genomic rearrangement. Since fusions are related to cancer, their discovery is important not only for science, but also allows medical use of RNA-seq.
The third chapter is devoted to the current status of this approach and illustrates a novel toolkit called InFusion, which provides a number of novelties in chimera discovery from RNA-seq data such as detection of fusions arising from the combination of a gene and an intronic or intergenic region. Moreover, strand-specificity of expressed fusion transcripts can be detected and reported. InFusion was compared in detail to a number of other existing tools based on simulated and real datasets and demonstrated higher precision and recall. Overall, RNA-sequencing technology goes further and more specialized analysis abilities are becoming available. New applications of RNA sequencing and future directions of research are discussed in the last chapter.